Yeast processing system comprising a negatively charged
amino acid adjacent to the processing site.



FIELD OF INVENTION 

The present invention relates to polypeptides expressed and
processed in yeast, a DNA construct comprising a DNA sequence
encoding such polypeptides, vectors carrying such DNA fragments
and yeast cells transformed with the vectors, as well
as a process of producing heterologous proteins in yeast.

BACKGROUND OF THE INVENTION 

Yeast organisms produce a number of proteins that are synthesized
intracellularly but function outside the cell. Such
extracellular proteins are referred to as secreted proteins.
These secreted proteins are expressed initially inside the
cell in a precursor or pre-protein form containing a presequence
ensuring effective direction of the expressed
product across the membrane of the endoplasmic reticulum
(ER). The presequence, normally named a signal peptide, is
generally cleaved off from the desired product during translocation.
Once it has entered the secretory pathway, the protein
is transported to the Golgi apparatus. From the Golgi the
protein can follow different routes that lead to compartments
such as the cell vacuole or the cell membrane, or it can be
routed out of the cell to be secreted to the external medium
(Pfeffer, S.R. and Rothman, J.E., Ann. Rev. Biochem. 56 (1987),
829-852).

Several approaches have been suggested for the expression and
secretion in yeast of proteins heterologous to yeast. European
published patent application No. 0088632A describes a
process by which proteins heterologous to yeast are expressed,
processed and secreted by transforming a yeast organism
with an expression vehicle harbouring DNA encoding the
desired protein and a signal peptide, preparing a culture of
the transformed organism, growing the culture and recovering
the protein from the culture medium. The signal peptide may
be the desired protein's own signal peptide, a heterologous
signal peptide or a hybrid of native and heterologous signal
peptide.

WO 90/10075

PCT/DK90/00058

A problem encountered with the use of signal peptides heterologous
to yeast may be that the heterologous signal peptide
does not ensure efficient translocation and/or cleavage after
the signal peptide.

The S. cerevisiae MFα1 (α-factor) is synthesized as a prepro
form of 165 amino acids comprising a 19 amino acid long signal
peptide or prepeptide followed by a 64 amino acid long "leader"
or propeptide, encompassing three N-linked glycosylation
sites, followed by (LysArg(Asp/Glu,Ala)2-3-α-factor)4 (Kurjan,
J. and Herskowitz, I., Cell 30 (1982), 933-943). The signal-leader
part of the preproMFα1 has been widely employed to obtain
synthesis and secretion of heterologous proteins in
S. cerevisiae.

Use of signal/leader peptides homologous to yeast is known
from, among others, US patent specification No. 4,546,082, European
published patent applications Nos. 0116201A, 0123294A,
0123544A, 0163529A, and 0123289A and DK patent specifications
Nos. 2484/84 and 3614/83.


In EP 0123289A utilization of the S. cerevisiae α-factor precursor
is described, whereas DK 2484/84 describes utilization
of the Saccharomyces cerevisiae invertase signal peptide and
DK 3614/83 utilization of the Saccharomyces cerevisiae PHO5
signal peptide for secretion of foreign proteins.

US patent specification No. 4,546,082, EP 0116201A, 0123294A,
0123544A, and 0163529A describe processes by which the α-factor
signal-leader from Saccharomyces cerevisiae (MFα1 or
MFα2) is utilized in the secretion process of expressed heterologous
proteins in yeast. By fusing a DNA sequence encoding
the S. cerevisiae MFα1 signal/leader sequence at the
5' end of the gene for the desired protein, secretion and processing
of the desired protein was demonstrated.

EP 206,783 discloses a system for the secretion of polypeptides
from S. cerevisiae whereby the α-factor leader sequence
has been truncated to eliminate the four α-factor peptides
present on the native leader sequence so as to leave the
leader peptide itself fused to a heterologous polypeptide via
the α-factor processing site LysArgGluAlaGluAla. This construction
is indicated to lead to efficient processing of
smaller peptides (less than 50 amino acids). For the secretion
and processing of larger polypeptides, the native α-factor
leader sequence has been truncated to leave one or two
α-factor peptides between the leader peptide and the polypeptide.

A number of secreted proteins are routed so as to be exposed
to a proteolytic processing system which can cleave the peptide
bond at the carboxy end of two consecutive basic amino
acids. This enzymatic activity is in S. cerevisiae encoded by
the KEX 2 gene (Julius, D.A. et al., Cell 37 (1984b), 1075).
Processing of the product by the KEX 2 gene product is needed
for the secretion of active S. cerevisiae mating factor α1
(MFα1 or α-factor) but is not involved in the secretion of
active S. cerevisiae mating factor a.
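
The cleavage rule described above (cutting at the carboxy end of two
consecutive basic amino acids) can be illustrated with a minimal Python
sketch. The function name and the list-of-residues representation are
assumptions made for illustration only. The example junction uses the
processing site LysArgGluAlaGluAla mentioned earlier, preceded by two
arbitrary placeholder residues. The sketch only locates candidate
dibasic sites; it says nothing about whether a site is actually
accessible to the enzyme.

```python
# Basic residues that can form a dibasic processing site.
BASIC = {"Lys", "Arg"}

def dibasic_cleavage_points(residues):
    """Return every index i such that residues[i-2] and residues[i-1]
    are both basic, i.e. cleavage would fall between i-1 and i."""
    return [
        i + 2
        for i in range(len(residues) - 1)
        if residues[i] in BASIC and residues[i + 1] in BASIC
    ]

# Hypothetical junction: two placeholder residues, then LysArgGluAlaGluAla.
junction = "Leu-Asp-Lys-Arg-Glu-Ala-Glu-Ala".split("-")
points = dibasic_cleavage_points(junction)
print(points)                # index of the first residue after the site
print(junction[points[0]:])  # residues left downstream of the cut
```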

Secretion and correct processing of a polypeptide intended to
be secreted is obtained in some cases when culturing a yeast
organism which is transformed with a vector constructed as
indicated in the references given above. In many cases, however,
the level of secretion is very low or there is no secretion,
or the proteolytic processing may be incorrect or
incomplete. The present inventors currently believe this to
be ascribable, to some extent, to an insufficient exposure of
the processing site present between the C-terminal end of the
leader peptide and the N-terminal end of the heterologous
protein so as to render it inaccessible or, at least, less
accessible to proteolytic cleavage.

SUMMARY OF THE INVENTION 

It has surprisingly been found that by providing certain
modifications near the processing site at the C-terminal end
of the leader peptide and/or the N-terminal end of a heterologous
polypeptide fused to the leader peptide, it is possible
to obtain a higher yield of the correctly processed protein
than is obtainable with unmodified leader peptide-heterologous
polypeptide constructions.

Accordingly, the present invention relates to a polypeptide
comprising a fusion of a signal peptide, a leader peptide and
a heterologous protein or polypeptide, which polypeptide is
modified in its amino acid sequence adjacent to a yeast processing
site positioned between the C-terminal end of the
leader peptide and the N-terminal end of the heterologous
protein so as to provide a presentation of the processing
site which makes it accessible to proteolytic cleavage, the
polypeptide having the following structure

signal peptide-leader peptide-X1-X2-X3-X4-heterologous protein

wherein X1 is a peptide bond or represents one or more amino
acids which may be the same or different,

X2 and X3 are the same or different and represent a basic
amino acid selected from the group consisting of Lys and Arg,
X2 and X3 together defining a yeast processing site, and
X4 is a peptide bond or represents one or more amino acids
which may be the same or different,

with the proviso that X1 and/or X4 represent one or more
amino acids and that at least one of the amino acids represented
by X1 and/or X4 is a negatively charged amino acid selected
from the group consisting of Glu and Asp.
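
The structural definition and its proviso can be restated as a small
check. The following Python sketch is illustrative only: the function
name and the convention of representing a peptide bond as an empty list
are assumptions, not part of the specification.

```python
ACIDIC = {"Glu", "Asp"}  # negatively charged residues named in the proviso
BASIC = {"Lys", "Arg"}   # residues allowed for X2 and X3

def satisfies_proviso(x1, x2, x3, x4):
    """x1 and x4 are lists of three-letter residues (an empty list stands
    for a peptide bond); x2 and x3 are single residues."""
    # X2 and X3 must together form a dibasic processing site.
    if x2 not in BASIC or x3 not in BASIC:
        return False
    # At least one residue of X1 and/or X4 must be Glu or Asp. This also
    # guarantees that X1 and X4 are not both plain peptide bonds.
    return any(residue in ACIDIC for residue in x1 + x4)

# X1 = GluArgLeuGlu, X4 = peptide bond
print(satisfies_proviso(["Glu", "Arg", "Leu", "Glu"], "Lys", "Arg", []))
# Unmodified junction: neither X1 nor X4 present
print(satisfies_proviso([], "Lys", "Arg", []))
```
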

In the present context, the term "signal peptide" is understood
to mean a presequence which is predominantly hydrophobic
in nature and present as an N-terminal sequence on the
precursor form of an extracellular protein expressed in
yeast. The function of the signal peptide is to allow the heterologous
protein to be secreted to enter the endoplasmic
reticulum. The signal peptide is normally cleaved off in the
course of this process. The signal peptide may be heterologous
or homologous to the yeast organism producing the protein
but, as explained above, a more efficient cleavage of
the signal peptide may be obtained when it is homologous to
the yeast organism in question.

The expression "leader peptide" is understood to indicate a
predominantly hydrophilic peptide whose function is to allow
the heterologous protein to be secreted to be directed from
the endoplasmic reticulum to the Golgi apparatus and further
to a secretory vesicle for secretion into the medium (i.e.
exportation of the expressed protein or polypeptide across
the cell wall or at least through the cellular membrane into
the periplasmic space of the cell).

The expression "heterologous protein or polypeptide" is intended
to indicate a protein or polypeptide which is not produced
by the host yeast organism in nature. The terms "protein"
and "polypeptide" are used substantially interchangeably
in the following description.

The modification of the polypeptide at the processing site
(the site at which the leader peptide is removed from the
heterologous protein by proteolytic cleavage provided by
yeast proteolytic enzymes), which modification is represented
by X1 and/or X4, may be in the form of an extension, substitution
or deletion of one or more amino acids at the C-terminal
end of the leader peptide and/or at the N-terminal end of the
heterologous protein.

In accordance with the invention, it has surprisingly been
found that a more efficient and correct processing of the
heterologous protein may be obtained when at least one of the
amino acids with which the sequence of the polypeptide has
been extended or by which one or more of the native amino
acids in the sequence have been substituted is a negatively
charged amino acid such as the ones indicated above. A similar
effect is observed when a negatively charged amino acid
is provided in the proximity to the processing site by deletion,
i.e. by deleting one or more amino acids from the
C-terminal end of the leader or from the N-terminal end of the
protein sequence until a negatively charged amino acid is
present adjacent to the processing site.
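
The deletion variant described above can be pictured as a purely
mechanical trimming step. The sketch below is an illustration under
stated assumptions: the function name is hypothetical, and the input is
only the C-terminal tail (Ile-Ala-Glu-Asn-Thr-Thr-Leu-Ala) shared by
several of the synthetic leaders listed later in this specification.
The code cannot tell whether a leader shortened in this way would still
function; that is an experimental question.

```python
ACIDIC = {"Glu", "Asp"}

def trim_leader_to_acidic(leader):
    """Delete residues from the C-terminal end of the leader until the
    last remaining residue is Glu or Asp (or nothing is left)."""
    trimmed = list(leader)
    while trimmed and trimmed[-1] not in ACIDIC:
        trimmed.pop()
    return trimmed

tail = "Ile-Ala-Glu-Asn-Thr-Thr-Leu-Ala".split("-")
print("-".join(trim_leader_to_acidic(tail)))
```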

Without wishing to be limited to any particular theory, it is
assumed that this effect may be ascribed to the increased
hydrophilicity imposed by the negatively charged amino acids
adjacent to the processing site, which results in enhanced
exposure of the tertiary structure of the polypeptide at the
processing site in the aqueous intracellular environment and
therefore makes the site more accessible to proteolytic
cleavage by the processing enzyme. The advantageous effect of
negatively charged amino acids, in contrast to positively
charged or neutral amino acids, may be ascribed to the
negatively charged side chains of these amino acids, which
contribute to the hydrophilicity of the tertiary structure at
the processing site without giving rise to any potential
inhibition of the processing enzyme.

It is also possible that negatively charged amino acids contribute
to creating and maintaining a tertiary structure at
the processing site (e.g. turns, hairpins or loop structures)
by other means such as by interaction with other amino acid
residues in the polypeptide. Furthermore, direct interaction
between negatively charged amino acids and the processing enzyme
could also take place. Thus, it is believed that negatively
charged amino acids positioned adjacent to the two
basic amino acids of the processing site may direct the processing
enzyme to carry out a correct and efficient cleavage
by charge interactions between proteins and/or between protein
and solvent.

In another aspect, the present invention relates to a DNA
construct which comprises a DNA sequence encoding the polypeptide
defined above.

In a further aspect, the invention relates to a recombinant
expression vector which is capable of replicating in yeast
and which carries a DNA construct encoding the above-defined
polypeptide, as well as a yeast strain which is capable of
expressing the heterologous protein or polypeptide and which
is transformed with this vector.

In a still further aspect, the invention relates to a process
for producing a heterologous protein or polypeptide in yeast,
comprising cultivating the transformed yeast strain in a
suitable medium to obtain expression and secretion of the
heterologous protein or polypeptide, after which the protein
or polypeptide is isolated from the medium.

DETAILED DISCLOSURE OF THE INVENTION

Consistent with the explanation given above, when X1 represents
a single amino acid, this is Glu or Asp.

X1 and/or X4 suitably represent 1-6 amino acids.

When X1 represents a sequence of two amino acids, it may have
the structure BA, wherein A is Glu or Asp, and B is Glu, Asp,
Val, Gly or Leu, for instance LeuGlu.

When X1 represents a sequence of three amino acids, it may
have the structure CBA, wherein A and B are as defined above,
and C is Glu, Asp, Pro, Gly, Val, Leu, Arg or Lys, for instance
AspLeuGlu.

When X1 represents a sequence of more than two amino acids,
one of the additional amino acids may suitably be Pro or Gly
as these amino acids are known to introduce and/or form part
of turns, hairpins and loop structures likely to facilitate
the accessibility of the processing site to the proteolytic
enzyme.

When X1 represents a sequence of four amino acids, it may
have the structure DCBA, wherein A, B and C are as defined
above, and D has the same meanings as C, for instance
GluArgLeuGlu, LysGluLeuGlu or LeuAspLeuGlu.

When X1 represents a sequence of five amino acids, it may
have the structure EDCBA, wherein A, B, C and D are as defined
above, and E has the same meanings as C, for instance
LeuGluArgLeuGlu.

When X1 represents a sequence of six amino acids, it may have
the structure FEDCBA, wherein A, B, C, D and E are as defined
above, and F has the same meanings as C, for instance
ValLeuGluArgLeuGlu.

It will be understood that other combinations of amino acids
are possible as a meaning of X1 without departing from the
scope of the invention, provided that at least one of the
amino acids in the sequence is a negatively charged amino
acid, as explained above.

Suitable meanings of X4 may be the ones shown above for X1,
though the order of the amino acids will typically be reversed
(i.e. ABC rather than CBA, etc.).
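
The positional rules above (A next to the processing site, then B, then
up to four positions with the meanings of C) lend themselves to a
simple membership check. The Python sketch below is illustrative only;
the function names and the use of concatenated three-letter codes as
input are assumptions. It covers the "suitable" structures listed
above, not every combination that falls within the scope of the
invention.

```python
A_SET = {"Glu", "Asp"}
B_SET = A_SET | {"Val", "Gly", "Leu"}
C_SET = A_SET | {"Pro", "Gly", "Val", "Leu", "Arg", "Lys"}  # also D, E, F

def split3(seq):
    """Split a string of concatenated three-letter codes into residues."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

def fits_x1_pattern(seq):
    """Check X1 against the structures A, BA, CBA, ... FEDCBA."""
    residues = split3(seq)
    if not 1 <= len(residues) <= 6:
        return False
    # Read outwards from the processing site: A first, then B, then C-like.
    outward = residues[::-1]
    allowed = [A_SET, B_SET] + [C_SET] * 4
    return all(res in ok for res, ok in zip(outward, allowed))

def fits_x4_pattern(seq):
    """X4 uses the same positions in reversed order (ABC rather than CBA)."""
    return fits_x1_pattern("".join(reversed(split3(seq))))

examples = ["LeuGlu", "AspLeuGlu", "GluArgLeuGlu", "LysGluLeuGlu",
            "LeuAspLeuGlu", "LeuGluArgLeuGlu", "ValLeuGluArgLeuGlu"]
print(all(fits_x1_pattern(x) for x in examples))
print(fits_x4_pattern("GluLeu"), fits_x4_pattern("GluLeuAspLeu"))
```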



In embodiments of the polypeptide of the invention where the
heterologous protein is initiated by one or more positively
charged or hydrophobic amino acids, X4 advantageously represents
one or more amino acids rather than a peptide bond
as this is thought to ensure a greater accessibility of the
processing site to proteolytic enzymes.

In cases where X4 represents an N-terminal extension of 1-6
amino acids, it may be suitable to provide an additional processing
site between the amino acid or acids represented by
X4 and the N-terminal end of the heterologous protein. This
is particularly important when the protein is to be used for
purposes requiring the presence of a protein with no N-terminal
extension. The additional N-terminal amino acids may
be removed in vitro by proteolytic cleavage by means of a
suitable proteolytic enzyme, such as a trypsin-like protease,
or by means of treatment with a chemical such as cyanogen
bromide. It may also be possible to effect cleavage at the
additional processing site by the host yeast organism, selecting
a processing site specific for another yeast proteolytic
enzyme.

For some purposes, however, it may be advantageous to design
the N-terminal extension to fit a specific purpose. Thus, the
extension may serve as a marker for detection of the protein,
as an aid to purify the protein or as a means to control the
action of a pharmaceutical in vivo, e.g. to prolong the half-life
of a drug or to target it to a specific location in the
body.

In a preferred embodiment of the polypeptide of the present
invention, X1 and/or X4 represent an amino acid sequence of
1-4 amino acids. In this embodiment, the amino acid immediately
adjacent to X2 is preferably Glu or Asp, the amino acid
immediately adjacent to X3 is preferably Glu or Asp, or both
are Glu or Asp, as this provides a favourable presentation of
the processing site due to the hydrophilic nature of these
amino acids, as explained in more detail above. The amino
acid sequence represented by X1 or X4 may suitably comprise
more than one Glu or Asp.

In another interesting embodiment of the polypeptide of the
invention, X1 and X4 both represent one or more amino acids.
In other words, the polypeptide is modified both at the
C-terminal end of the leader peptide and at the N-terminal end
of the heterologous protein. In this embodiment, X1 and X4
may be symmetrically identical, that is, the amino acid or
acids represented by X1 and X4 are the same extending outwards
from X2 and X3, respectively.
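
One way to read "symmetrically identical" is that X4, written
N-terminal to C-terminal, is the mirror image of X1. The Python sketch
below encodes that reading; the function names and the choice of
LysArg for X2-X3 in the example are assumptions for illustration.

```python
def is_symmetric(x1, x4):
    """True when X1 and X4 list the same residues reading outwards from
    X2 and X3, i.e. X4 equals X1 in reversed order."""
    return list(x4) == list(reversed(x1))

def junction(x1, x2, x3, x4):
    """Write out the X1-X2-X3-X4 stretch as a hyphenated string."""
    return "-".join(list(x1) + [x2, x3] + list(x4))

x1 = ["Leu", "Glu"]
x4 = ["Glu", "Leu"]
print(is_symmetric(x1, x4))
print(junction(x1, "Lys", "Arg", x4))
```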

The signal peptide sequence of the polypeptide of the invention
may be any signal peptide which ensures an effective direction
of the expressed polypeptide into the secretory pathway
of the cell. The signal peptide may be a naturally occurring
signal peptide or functional parts thereof, or it may
be a synthetic peptide. Suitable signal peptides have been
found to be the α-factor signal peptide, the signal peptide
of mouse salivary amylase, a modified carboxypeptidase signal
peptide or the yeast BAR1 signal peptide. The mouse salivary
amylase signal sequence is described by O. Hagenbüchle
et al., Nature 289, 1981, pp. 643-646. The carboxypeptidase
signal sequence is described by L.A. Valls et al., Cell 48,
1987, pp. 887-897. The BAR1 signal peptide is disclosed in WO
87/02670.

The leader peptide sequence of the polypeptide of the invention
may be any leader peptide which is functional in directing
the expressed polypeptide to the endoplasmic reticulum
and further along the secretory pathway. Possible leader sequences
which are suited for this purpose are natural leader
peptides derived from yeast or other organisms, such as the
α-factor leader or a functional analogue thereof. The
leader peptide may also be a synthetic leader peptide, e.g.
one of the synthetic leaders disclosed in International
Patent Application, Publication No. WO 89/02463 with the following
amino acid sequences

A. Ala-Pro-Val-Thr-Gly-Asp-Glu-Ser-Ser-Val-Glu-Ile-
   Pro-Glu-Glu-Ser-Leu-Ile-Gly-Phe-Leu-Asp-Leu-Ala-
   Gly-Glu-Glu-Ile-Ala-Glu-Asn-Thr-Thr-Leu-Ala

B. Ala-Pro-Val-Thr-Gly-Asp-Glu-Ser-Ser-Val-Glu-Ile-
   Pro-Glu-Glu-Ser-Leu-Ile-Ile-Ala-Glu-Asn-Thr-Thr-
   Leu-Ala

C. Ala-Pro-Val-Thr-Gly-Asp-Glu-Ser-Ser-Val-Glu-Ile-
   Pro-Ile-Ala-Glu-Asn-Thr-Thr-Leu-Ala

D. Ala-Pro-Val-Thr-Gly-Asp-Glu-Ser-Ser-Val-Glu-Ile-
   Pro-Glu-Glu-Ser-Leu-Ile-Ile-Ala-Glu-Asn-Thr-Thr-
   Leu-Ala-Asn-Val-Ala-Met-Ala and

E. Gln-Pro-Val-Thr-Gly-Asp-Glu-Ser-Ser-Val-Glu-Ile-
   Pro-Glu-Glu-Ser-Leu-Ile-Ile-Ala-Glu-Asn-Thr-Thr-
   Leu-Ala-Asn-Val-Ala-Met-Ala

or a derivative thereof.

The heterologous protein produced by the method of the invention
may be any protein which may advantageously be produced
in yeast. Examples of such proteins are aprotinin or other
protease inhibitors, insulin (including insulin precursors),
human or bovine growth hormone, interleukin, glucagon, tissue
plasminogen activator, Factor VII, Factor VIII, Factor XIII,
platelet-derived growth factor, enzymes, etc., or a functional
analogue thereof. In the present context, the term
"functional analogue" is meant to indicate a polypeptide with
a similar function as the native protein (this is intended to
be understood as relating to the nature rather than the level
of biological activity of the native protein). The
polypeptide may be structurally similar to the native
protein and may be derived from the native protein by addition
of one or more amino acids to either or both the C- and
N-terminal end of the native protein, substitution of one or
more amino acids at one or a number of different sites in the
native amino acid sequence, deletion of one or more amino
acids at either or both ends of the native protein or at one
or several sites in the amino acid sequence, or insertion of
one or more amino acids at one or more sites in the native
amino acid sequence. Such modifications are well known for
several of the proteins mentioned above.

According to the invention, it has surprisingly been found
that modifications at the C-terminal end of the leader peptide
or at the N-terminal end of the heterologous protein as
described above allow for the production in high yields of
correctly processed aprotinin (or a functional analogue
thereof, as defined above). Aprotinin is a protease-inhibiting
protein whose properties make it useful for a variety of
medical purposes (e.g. in the treatment of pancreatitis, septic
shock syndrome, hyperfibrinolytic haemorrhage and myocardial
infarction). Administration of aprotinin in high doses
significantly reduces blood loss in connection with cardiac
surgery or other major surgery. Aprotinin is also useful as
an additive to culture media as it inhibits host cell proteases
which might otherwise cause unwanted proteolytic cleavage
of expressed proteins. Difficulties have been experienced
in obtaining a high yield of correctly processed aprotinin in
yeast using known natural leader sequences such as the α-factor
leader. Native aprotinin is initiated at the N-terminal
by a basic amino acid (Arg), and for this reason the preceding
processing site may be less apt for proteolytic cleavage
(as explained above) when one of the known leader peptides
(e.g. the α-factor leader) is employed without being modified
according to the present invention, resulting in low yields,
if any, of correctly processed aprotinin.

According to the present invention, particularly good results
have been achieved with respect to the production of aprotinin
when X1 represents GluArgLeuGlu or when X4 represents
GluLeu or GluLeuAspLeu (cf. the following examples), although
it is expected that advantageous yields of aprotinin may be
obtained with other meanings of X1 and X4, provided that at
least one amino acid, and most preferably the one closest to
the processing site, is a negatively charged amino acid, as
explained above.

Also according to the invention, it has been found that modifications
at the C-terminal end of the leader peptide or at
the N-terminal end of the heterologous protein as described
above permit production of high yields of correctly processed
insulin precursor (or a functional analogue thereof as defined
above). In this embodiment of the invention, X1 may represent
Glu-Arg-Leu-Glu or Lys-Glu-Leu-Glu, in which case X4
usually represents a peptide bond. Alternatively, X4 may represent
Glu, in which case X1 usually represents a peptide
bond. In particular, advantageous results have been achieved
according to the invention with respect to the expression of
the insulin analogue precursor B(1-29)-AlaAlaLys-A(1-21) when
X1 represents LysGluLeuGlu or when X4 represents a substitution
of Phe as the first amino acid of the insulin precursor
by Glu (cf. the following examples). Reasonable yields have
also been obtained of the insulin analogue precursor
B(1-29)-SerAspAspAlaLys-A(1-21) when X1 represents
Lys-Glu-Leu-Glu. Furthermore, it is expected that high yields
of insulin precursor may be obtained with other meanings of
X1 and X4, provided that at least one amino acid, and most
preferably the one closest to the processing site, is a
negatively charged amino acid, as explained above.

The DNA construct of the invention encoding the polypeptide
of the invention may be prepared synthetically by established
standard methods, e.g. the phosphoramidite method described by
S.L. Beaucage and M.H. Caruthers, Tetrahedron Letters 22,
1981, pp. 1859-1869, or the method described by Matthes et
al., EMBO Journal 3, 1984, pp. 801-805. According to the
phosphoramidite method, oligonucleotides are synthesized,
e.g. in an automatic DNA synthesizer, purified, duplexed and
ligated to form the synthetic DNA construct.

The DNA construct of the invention may also be of genomic or
cDNA origin, for instance obtained by preparing a genomic or
cDNA library and screening for DNA sequences coding for all
or part of the polypeptide of the invention by hybridization
using synthetic oligonucleotide probes in accordance with
standard techniques (cf. T. Maniatis et al., Molecular Cloning:
A Laboratory Manual, Cold Spring Harbor, 1982). In this
case, a genomic or cDNA sequence encoding a signal and leader
peptide may be joined to a genomic or cDNA sequence encoding
the heterologous protein, after which the DNA sequence may be
modified at a site corresponding to the amino acid sequence
X1-X2-X3-X4 of the polypeptide, e.g. by site-directed mutagenesis
using synthetic oligonucleotides encoding the desired
amino acid sequence for homologous recombination in accordance
with well-known procedures.

Finally, the DNA construct may be of mixed synthetic and genomic,
mixed synthetic and cDNA or mixed genomic and cDNA
origin prepared by annealing fragments of synthetic, genomic
or cDNA origin (as appropriate), the fragments corresponding
to various parts of the entire DNA construct, in accordance
with standard techniques. Thus, it may be envisaged that the
DNA sequence encoding the heterologous protein may be of genomic
origin, while the sequence encoding the leader peptide
may be prepared synthetically.

Preferred DNA constructs encoding aprotinin are as shown in
the appended figures 4, 7, 9, 11 and 12, or suitable modifications
thereof coding for aprotinin or a functional analogue
thereof. Examples of suitable modifications of the DNA sequence
are nucleotide substitutions which do not give rise to
another amino acid sequence of the protein, but which may
correspond to the codon usage of the yeast organism into
which the DNA construct is inserted, or nucleotide substitutions
which do give rise to a different amino acid sequence
and therefore, possibly, a different protein structure without,
however, impairing the anti-protease properties of native
aprotinin. Other examples of possible modifications are
insertion of one or more nucleotides into the sequence, addition
of one or more nucleotides at either end of the sequence
and deletion of one or more nucleotides at either end of or
within the sequence. Examples of specific aprotinin analogues
are those described in European Patent Application, Publication
No. 339 942.
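
The distinction drawn above between substitutions that leave the amino
acid sequence unchanged and those that alter it can be checked
mechanically. The Python sketch below is a hedged illustration: it
carries only the part of the standard genetic code needed for the
example, the function names are hypothetical, and the three DNA strings
are invented for the purpose (they are not taken from the appended
figures). It makes no statement about which codons a given yeast
organism prefers.

```python
# Partial standard genetic code: only the amino acids used in this sketch.
CODONS = {
    "Glu": ["GAA", "GAG"],
    "Asp": ["GAT", "GAC"],
    "Lys": ["AAA", "AAG"],
    "Arg": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "Leu": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "Val": ["GTT", "GTC", "GTA", "GTG"],
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}

def translate(dna):
    """Translate a coding sequence; raises KeyError for codons outside
    the partial table above."""
    if len(dna) % 3:
        raise ValueError("length must be a multiple of three")
    return [CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3)]

def is_silent(original, variant):
    """True when both sequences encode the same amino acid sequence."""
    return translate(original) == translate(variant)

# Invented sequences encoding GluArgLeuGlu followed by LysArg.
original = "GAA AGA TTG GAA AAG AGA".replace(" ", "")
recoded = "GAG CGT CTT GAG AAA CGT".replace(" ", "")   # same residues
altered = "GAA AGA TTG GTA AAG AGA".replace(" ", "")   # fourth codon is Val

print(translate(original))
print(is_silent(original, recoded), is_silent(original, altered))
```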

Preferred DNA constructs encoding insulin precursors are as 
shown in the appended figures 16 and 17 or suitable modifica- 
tions thereof, as defined above. 

The recombinant expression vector carrying a DNA sequence encoding
the polypeptide of the invention may be any vector
which is capable of replicating in yeast organisms. In the
vector, the DNA sequence encoding the polypeptide of the invention
should be operably connected to a suitable promoter
sequence. The promoter may be any DNA sequence which shows
transcriptional activity in yeast and may be derived from
genes encoding proteins either homologous or heterologous to
yeast. The promoter is preferably derived from a gene encoding
a protein homologous to yeast. Examples of suitable
promoters are the Saccharomyces cerevisiae MFα1, TPI, ADH or
PGK promoters.


The DNA sequence encoding the polypeptide of the invention
should also be operably connected to a suitable terminator,
e.g. the TPI terminator (cf. T. Alber and G. Kawasaki,
J. Mol. Appl. Genet. 1, 1982, pp. 419-434).

The recombinant expression vector of the invention further
comprises a DNA sequence enabling the vector to replicate in
yeast. Examples of such sequences are the yeast plasmid 2μ
replication genes REP 1-3 and origin of replication. The vector
may also comprise a selectable marker, e.g. the
Schizosaccharomyces pombe TPI gene as described by P.R. Russell,
Gene 40, 1985, pp. 125-130.

The procedures used to ligate the DNA sequences coding for
the polypeptide of the invention, the promoter and the terminator,
respectively, and to insert them into suitable yeast
vectors containing the information necessary for yeast replication,
are well known to persons skilled in the art (cf.,
for instance, Maniatis et al., op. cit.). It will be understood
that the vector may be constructed either by first preparing
a DNA construct containing the entire DNA sequence
coding for the polypeptide of the invention and subsequently
inserting this fragment into a suitable expression vector, or
by sequentially inserting DNA fragments containing genetic
information for the individual elements (such as the signal,
leader or heterologous protein) followed by ligation.

The yeast organism used in the process of the invention may
be any suitable yeast organism which, on cultivation, produces
large amounts of the heterologous protein or polypeptide
in question. Examples of suitable yeast organisms may be
strains of the yeast species Saccharomyces cerevisiae,
Saccharomyces kluyveri, Schizosaccharomyces pombe or
Saccharomyces uvarum. The transformation of the yeast cells may for
instance be effected by protoplast formation (cf. Example 1
below) followed by transformation in a manner known per se.
The medium used to cultivate the cells may be any conventional
medium suitable for growing yeast organisms. The secreted
heterologous protein, a significant proportion of
which will be present in the medium in correctly processed
form, may be recovered from the medium by conventional procedures
including separating the yeast cells from the medium
by centrifugation or filtration, precipitating the proteinaceous
components of the supernatant or filtrate by means of
a salt, e.g. ammonium sulphate, followed by purification by a
variety of chromatographic procedures, e.g. ion exchange
chromatography, affinity chromatography, or the like.
In an additional aspect, the invention relates to a novel
aprotinin analogue of the general formula X4-aprotinin(1-58),
wherein X4 represents an N-terminal extension by one or more
amino acids at least one of which is a negatively charged
amino acid selected from the group consisting of Glu and Asp.
X4 may represent a sequence of 1-6 amino acids, in particular
1-4 amino acids, and may have any of the meanings given
above. Particularly preferred meanings of X4 are Glu-Leu and
Glu-Leu-Asp-Leu.
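
The constraints stated above on the N-terminal extension X4 (one to six amino acids, at least one of which is Glu or Asp) can be illustrated with a minimal sketch. The function name `is_valid_x4` and the one-letter amino acid encoding are assumptions made for illustration only; they are not part of the disclosure.

```python
# Hypothetical helper: checks a candidate X4 extension given in one-letter code.
NEGATIVE = {"E", "D"}  # Glu and Asp

def is_valid_x4(extension: str, max_len: int = 6) -> bool:
    """True if the extension has 1..max_len residues and at least one Glu/Asp."""
    ext = extension.upper()
    return 1 <= len(ext) <= max_len and any(res in NEGATIVE for res in ext)

print(is_valid_x4("EL"))    # Glu-Leu
print(is_valid_x4("ELDL"))  # Glu-Leu-Asp-Leu
print(is_valid_x4("LL"))    # no negatively charged residue
```

The two preferred meanings of X4 named in the text pass this check, while an extension lacking Glu and Asp does not.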

BRIEF DESCRIPTION OF THE DRAWINGS 

The invention is further disclosed in the following examples 
with reference to the appended drawings, wherein 

Fig. 1 shows the DNA and amino acid sequence of aprotinin(1-58).
The staggered lines shown in the DNA sequence denote
sites at which duplexes formed from synthetic oligonucleotides
were ligated.

Fig. 2 shows the construction of plasmid pKFN-802 and pKFN-803.

Fig. 3 shows the construction of plasmid pKFN-849 and pKFN-855.

Fig. 4 shows the DNA sequence of a 406 bp EcoRI-XbaI fragment
from pKFN-849 and pKFN-855. The arrow denotes the site at
which proteolytic cleavage takes place during secretion.

Fig. 5A and 5B show the inhibition of trypsin and plasmin,
respectively, by aprotinin; t denotes bovine pancreatic aprotinin,
x denotes GluLeu-aprotinin and £ denotes GluLeuAspLeu-aprotinin.




Fig. 6 shows the construction of plasmids pKFN-852 and pKFN-858.

Fig. 7 shows the DNA sequence of a 412 bp EcoRI-XbaI fragment
from pKFN-852 and pKFN-858. The arrow denotes the site at
which proteolytic cleavage takes place during secretion.

Fig. 8 shows the construction of plasmid pKFN-995 and pKFN-998.

Fig. 9 shows the DNA sequence of a 412 bp EcoRI-XbaI fragment
from pKFN-995 and pKFN-998. The arrow denotes the site of
proteolytic cleavage during secretion.

Fig. 10 shows the construction of plasmids pKFN-1000 and
pKFN-1003.

Fig. 11 shows the DNA sequence of the 412 bp EcoRI-XbaI fragment
from pKFN-1000 and pKFN-1003. The arrow denotes the site
at which proteolytic cleavage takes place during secretion.

Fig. 12 shows the DNA sequence of a synthetic aprotinin(3-58)
gene. The staggered lines within the sequence indicate the
sites where five duplexes formed from 10 synthesized oligonucleotides
are ligated.

Fig. 13 shows the construction of plasmid pKFN-305 and
pKFN-374/375.

Fig. 14 shows the construction of plasmid pMT-636.

Fig. 15 shows the construction of plasmid pLaC-240.

Fig. 16 shows the DNA sequence of the modified leader-insulin
precursor gene from pLaC-240.

Fig. 17 shows the DNA sequence of a 508 bp EcoRI-XbaI fragment
from pKFN-458 encoding the MFα1-signal-leader(1-85) and
insulin analogue precursor B(1-29,1Glu+27Glu)-AlaAlaLys-A(1-21).

EXAMPLES

Example 1

Production of Glu-Leu-aprotinin(1-58) from yeast strain KFN-837

a) Construction of plasmid pKFN-802

A synthetic gene coding for aprotinin(1-58) was constructed
from 10 oligonucleotides by ligation.

The oligonucleotides were synthesized on an automatic DNA
synthesizer using phosphoramidite chemistry on a controlled
pore glass support (Beaucage, S.L., and Caruthers, M.H.,
Tetrahedron Letters 22 (1981), 1859-1869).

The following 10 oligonucleotides were synthesized: 


NOR-760 : CATGGCCAAAAGAAGGCCTGATTTCTGTTTGGAACCTCCATACACTGGTCC 
NOR-754 : TTACATGGACCAGTGTATGGAGGTTCCAAACAGA^^ 
GGC 

NOR-354: ATGTAAAGCTAGAATCATCAGATACTTCTACAACG
NOR-355: CTTGGCGTTGTAGAAGTATCTGATGATTCTAGCT
NOR-356: CCAAGGCTGGTTTGTGTCAAACTTTCGTTTACGGTGGCT
NOR-357: CTCTGCAGCCACCGTAAACGAAAGTTTGACACAAACCAGC
NOR-358: GCAGAGCTAAGAGAAACAACTTCAAGT
NOR-359: AGCAGACTTGAAGTTGTTTCTCTTAG
NOR-360: CTGCTGAAGACTGCATGAGAACTTGTGGTGGTGCCTAAT
NOR-361: CTAGATTAGGCACCACCACAAGTTCTCATGCAGTCTTC

Five duplexes A - E were formed from the above 10 oligonucleotides
as shown in Fig. 1.

20 pmole of each of the duplexes A - E were formed from the
corresponding pairs of 5'-phosphorylated oligonucleotides by
heating for 5 min. at 90 °C followed by cooling to room temperature
over a period of 75 minutes. The five duplexes were
mixed and treated with T4 DNA ligase. The synthetic gene was
isolated as a 191 bp band after electrophoresis of the ligation
mixture on a 2% agarose gel. The obtained synthetic gene
is shown in Fig. 1.
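
The duplex assembly described above can be checked with a short sketch. It assumes, from sequence complementarity rather than from Fig. 1 itself, that NOR-760, NOR-354, NOR-356, NOR-358 and NOR-360 form one strand of the gene and that NOR-355 is the partner of NOR-354; the helper names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

def paired_region(top: str, bottom: str) -> str:
    """Longest 3' part of `top` that base-pairs with the 3' part of `bottom`."""
    rc = reverse_complement(bottom)
    for n in range(min(len(top), len(rc)), 0, -1):
        if top[-n:] == rc[:n]:
            return top[-n:]
    return ""

# Assumed top-strand oligonucleotides, copied from the list above.
TOP_STRAND = {
    "NOR-760": "CATGGCCAAAAGAAGGCCTGATTTCTGTTTGGAACCTCCATACACTGGTCC",
    "NOR-354": "ATGTAAAGCTAGAATCATCAGATACTTCTACAACG",
    "NOR-356": "CCAAGGCTGGTTTGTGTCAAACTTTCGTTTACGGTGGCT",
    "NOR-358": "GCAGAGCTAAGAGAAACAACTTCAAGT",
    "NOR-360": "CTGCTGAAGACTGCATGAGAACTTGTGGTGGTGCCTAAT",
}
NOR_355 = "CTTGGCGTTGTAGAAGTATCTGATGATTCTAGCT"

gene_length = sum(len(s) for s in TOP_STRAND.values())
core = paired_region(TOP_STRAND["NOR-354"], NOR_355)
overhang = TOP_STRAND["NOR-354"][:-len(core)]

print(gene_length)          # total length of the assumed top strand
print(len(core), overhang)  # paired length and 5' single-stranded end of NOR-354
```

Under these assumptions the summed strand length agrees with the 191 bp band reported for the ligated gene, and the single-stranded 5' end of NOR-354 is what allows it to anneal to the neighbouring duplex.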

The synthetic gene was ligated to a 209 bp EcoRI-NcoI fragment
from pLaC212spx3 and to the 2.7 kb EcoRI-XbaI fragment
of plasmid pUC19 (Yanisch-Perron, C., Vieira, J. and Messing,
J., Gene 33 (1985), 103-119). Plasmid pLaC212spx3 is described
in Example 3 of International Patent Application,
Publication No. WO 89/02463.

The 209 bp EcoRI-NcoI fragment from pLaC212spx3 encodes a
synthetic yeast leader peptide.

The ligation mixture was used to transform a competent E.
coli strain (r-, m+) selecting for ampicillin resistance.
Sequencing of a 32P-XbaI-EcoRI fragment (Maxam, A. and Gilbert,
W., Methods Enzymol. 65 (1980), 499-560) showed that plasmids
from the resulting colonies contained the correct DNA sequence
for aprotinin(1-58).

One plasmid, pKFN-802, was selected for further use. The
construction of plasmid pKFN-802 is illustrated in Fig. 2.

b) Construction of plasmids pKFN-849, pKFN-855 and yeast
strain KFN-837.

The 3.0 kb NcoI-StuI fragment of pKFN-802 was ligated to the
synthetic fragment NOR-790/791 with T4 DNA ligase:

  M  A  K  R  E  L  R
CATGGCTAAGAGAGAATTGAGA
    CGATTCTCTCTTAACTCT
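
As a minimal sketch, the amino acid line shown above the synthetic fragment can be reproduced by translating the upper strand with the standard genetic code. The reading frame (starting at the ATG one base into the NcoI overhang) and the assumption that the upper strand is NOR-790 are illustrative choices, not statements from the text.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str, start: int = 0) -> str:
    """Translate complete codons of `dna` beginning at index `start`."""
    codons = (dna[i:i + 3] for i in range(start, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)

UPPER_STRAND = "CATGGCTAAGAGAGAATTGAGA"  # assumed to be NOR-790
print(translate(UPPER_STRAND, start=1))
```

The final Arg codon (AGA) corresponds to Arg1 of aprotinin, so the Glu-Leu pair sits directly between the Lys-Arg processing site and the aprotinin sequence.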

The ligation mixture was digested with the restriction enzyme
StuI in order to reduce any background of pKFN-802, and the
resulting mixture was used to transform a competent E. coli
strain (r-, m+) selecting for ampicillin resistance. Plasmid
pKFN-849 from one of the resulting colonies was shown by DNA
sequencing (Sanger, F., Nicklen, S., and Coulson, A.R.,
Proc. Natl. Acad. Sci. USA 74 (1977), 5463-5467) to contain the
DNA sequence for Glu-Leu-aprotinin(1-58) correctly fused to
the synthetic yeast leader gene. The construction of plasmid
pKFN-849 is illustrated in Fig. 3.

pKFN-849 was cut with EcoRI and XbaI and the 406 bp fragment
was ligated to the 9.5 kb NcoI-XbaI fragment from pMT636 and
the 1.4 kb NcoI-EcoRI fragment from pMT636, resulting in
plasmid pKFN-855, see Fig. 3. Plasmid pMT636 is described in
International Patent Application No. PCT/DK88/00138.

pMT636 is an E. coli - S. cerevisiae shuttle vector containing
the Schizosaccharomyces pombe TPI gene (POT) (Russell,
P.R., Gene 40 (1985), 125-130), the S. cerevisiae triosephosphate
isomerase promoter and terminator, TPIp and TPIt
(Alber, T. and Kawasaki, G., J. Mol. Appl. Gen. 1 (1982),
419-434). Plasmid pKFN-855 contains the following sequence:

TPIp-LaC212spx3 signal-leader-Glu-Leu-aprotinin(1-58)-TPIt

where LaC212spx3 signal-leader is the synthetic yeast leader
described in International Patent Application, Publication
No. WO 89/02463. The DNA sequence of the 406 bp EcoRI-XbaI
fragment from pKFN-849 and pKFN-855 is shown in Fig. 4.
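
The fragment sizes quoted in this example can be reconciled with simple arithmetic. The sketch below assumes that the EcoRI-XbaI insert of pKFN-802 is the 209 bp leader fragment plus the 191 bp synthetic gene, and that the NcoI-StuI segment replaced by the linker is the first 16 bases of NOR-760 (StuI cutting blunt within AGGCCT); the names are illustrative.

```python
PARENT_ECORI_XBAI = 209 + 191          # leader fragment + synthetic gene in pKFN-802
REPLACED_SEGMENT = "CATGGCCAAAAGAAGG"  # NcoI overhang up to the StuI blunt end (from NOR-760)

def expected_size(linker_upper_strand: str) -> int:
    """EcoRI-XbaI fragment size after swapping the NcoI-StuI segment for a linker."""
    return PARENT_ECORI_XBAI - len(REPLACED_SEGMENT) + len(linker_upper_strand)

print(expected_size("CATGGCTAAGAGAGAATTGAGA"))  # linker NOR-790/791
```

On these assumptions the Glu-Leu linker adds two codons (6 bp) to the 400 bp parent fragment, which is consistent with the 406 bp fragment reported above.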

S. cerevisiae strain MT663 (E2-7B XE11-36 a/α, ΔtpiΔtpi, pep
4-3/pep 4-3) was grown on YPGaL (1% Bacto yeast extract, 2%
Bacto peptone, 2% galactose, 1% lactate) to an O.D. at 600 nm
of 0.6.

100 ml of culture was harvested by centrifugation, washed
with 10 ml of water, recentrifuged and resuspended in 10 ml
of a solution containing 1.2 M sorbitol, 25 mM Na2EDTA pH =
8.0 and 6.7 mg/ml dithiothreitol. The suspension was incubated
at 30 °C for 15 minutes, centrifuged and the cells resuspended
in 10 ml of a solution containing 1.2 M sorbitol, 10
mM Na2EDTA, 0.1 M sodium citrate, pH = 5.8, and 2 mg
Novozym® 234. The suspension was incubated at 30 °C for 30
minutes, the cells collected by centrifugation, washed in 10
ml of 1.2 M sorbitol and 10 ml of CAS (1.2 M sorbitol, 10 mM
CaCl2, 10 mM Tris HCl (Tris = Tris(hydroxymethyl)aminomethane)
pH = 7.5) and resuspended in 2 ml of CAS. For transformation
0.1 ml of CAS-resuspended cells were mixed with approx.
1 µg of plasmid pKFN-855 and left at room temperature
for 15 minutes. 1 ml of (20% polyethylene glycol 4000, 10 mM
CaCl2, 10 mM Tris HCl, pH = 7.5) was added and
the mixture left for a further 30 minutes at room temperature.
The mixture was centrifuged and the pellet resuspended
in 0.1 ml of SOS (1.2 M sorbitol, 33% v/v YPD, 6.7 mM CaCl2,
14 µg/ml leucine) and incubated at 30 °C for 2 hours. The suspension
was then centrifuged and the pellet resuspended in
0.5 ml of 1.2 M sorbitol. Then, 6 ml of top agar (the SC medium
of Sherman et al. (Methods in Yeast Genetics, Cold
Spring Harbor Laboratory, 1981) containing 1.2 M sorbitol
plus 2.5% agar) at 52 °C was added and the suspension poured
on top of plates containing the same agar-solidified, sorbitol
containing medium. Transformant colonies were picked
after 3 days at 30 °C, reisolated and used to start liquid
cultures. One such transformant, KFN-837, was selected for
further characterization.
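
For readers preparing the CAS solution used above, the following sketch converts the stated molar concentrations into masses. The molar masses are assumptions on our part (anhydrous CaCl2 and Tris free base, with the pH adjusted separately with HCl); a hydrated salt or Tris hydrochloride would need different values.

```python
# Assumed molar masses in g/mol (anhydrous CaCl2, Tris free base).
MOLAR_MASS = {"sorbitol": 182.17, "CaCl2": 110.98, "Tris": 121.14}

# CAS composition as given in the text, in mol/l.
CAS = {"sorbitol": 1.2, "CaCl2": 0.010, "Tris": 0.010}

def grams_needed(compound: str, molar_conc: float, volume_ml: float) -> float:
    """Mass of `compound` for `volume_ml` of solution at `molar_conc` mol/l."""
    return MOLAR_MASS[compound] * molar_conc * volume_ml / 1000.0

for name, conc in CAS.items():
    print(f"{name}: {grams_needed(name, conc, 10):.4f} g per 10 ml")
```

The sorbitol dominates the recipe by mass, reflecting its role as the osmotic stabiliser for the protoplasts.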

Yeast strain KFN-837 was grown on YPD medium (1% yeast extract,
2% peptone (from Difco Laboratories), and 6% glucose).
A 200 ml culture of the strain was shaken at 250 rpm at 30 °C
for 3 days to an O.D. at 600 nm of 20 (dry yeast biomass 18.8
g/liter). After centrifugation the supernatant was analyzed
by FPLC ion exchange chromatography. The yeast supernatant
was filtered through a 0.22 µm Millex GV filter unit and 1 ml
was applied on a MonoS cation exchange column (0.5 x 5 cm)
equilibrated with 20 mM Bicine, pH 8.7. After wash with
equilibration buffer the column was eluted with a linear NaCl
gradient (0-1 M) in equilibration buffer. Trypsin inhibitor
activity was quantified in the eluted fractions by spectrophotometric
assay (Kassel, B., Methods Enzymol. 19 (1970),
844-852) and furthermore by integration of absorption at 280
nm from

E(1%, 280 nm) (aprotinin) = 8.3

The yield was 120 mg/liter of Glu-Leu-aprotinin(1-58).
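
The extinction value quoted above can be turned into a concentration estimate as sketched below. This is a plain Beer-Lambert conversion for a single absorbance reading, assuming a 1 cm light path; it is not the peak-integration procedure used in the example, and the function name is illustrative.

```python
E_1PCT_280 = 8.3  # A280 of a 1% (10 mg/ml) aprotinin solution, value from the text

def mg_per_ml(a280: float, path_cm: float = 1.0) -> float:
    """Protein concentration in mg/ml from an absorbance reading at 280 nm."""
    percent_w_v = a280 / (E_1PCT_280 * path_cm)  # g per 100 ml
    return percent_w_v * 10.0                    # 1% w/v equals 10 mg/ml

print(round(mg_per_ml(0.83), 3))
```

Because the N-terminal extensions contain no aromatic residues, using the aprotinin coefficient for the analogues is a reasonable approximation, though the mass-based value will differ slightly with the added residues.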

For amino acid analysis and N-terminal sequencing, concentration
and further purification of the gradient eluted
Glu-Leu-aprotinin(1-58) was accomplished by HPLC on a reversed
phase column (Vydac C4, 4.6 x 250 mm). Elution was carried
out with a CH3CN gradient in 0.1% TFA. The collected fractions
were concentrated to about 100 µl by vacuum centrifugation
and samples were taken for N-terminal sequencing and
amino acid analysis.

By N-terminal sequencing the following sequence was found:

Glu-Leu-Arg-Pro-Asp-Phe-X-Leu-Glu-Pro-Pro-Tyr-Thr-Gly-Pro-X-
Lys-Ala-Arg-Ile-Ile-Arg-Tyr-Phe-Tyr-Asn-Ala-Lys-Ala

confirming that the N-terminal end is correct. Half-cysteine
residues are not determined by this method, which is in
accordance with the blank cycles (X) obtained for residues
No. 7 and 16.
The amino acid analysis is shown in Table 1. From this table
it appears that the product has the expected amino acid composition,
i.e. more Glu and Leu. The slightly lowered content
of Ile can most probably be ascribed to incomplete hydrolysis
of Ile(18)-Ile(19) (this is well known in the art). Also, Arg
is slightly higher than expected. This is, however, also seen
with native aprotinin (Table 1, third column).

When compared by the above-mentioned method of Kassel the
specific activity of Glu-Leu-aprotinin(1-58) was found to be
identical within the experimental error with the specific activity
of native aprotinin. The trypsin and plasmin inhibition
by Glu-Leu-aprotinin(1-58) in terms of titration curves
determined as described below were indistinguishable from
those of bovine pancreatic aprotinin (Aprotinin Novo). After
incubation of the enzyme with aprotinin or its analogues for
30 minutes, 0.6 mM S 2251 (KabiVitrum) was added and the activity
measured as the rate of nitroaniline production. The
activity as a function of aprotinin concentration is plotted
in Fig. 5A and 5B. Complete inhibition of both enzymes with
all three inhibitors was observed.

Example 2 

Production of Glu-Leu-Asp-Leu-aprotinin(1-58) from yeast
strain KFN-840

A synthetic gene encoding Glu-Leu-Asp-Leu-aprotinin(1-58) was
constructed as described in Example 1. The synthetic fragment
NOR-793/794 was used instead of NOR-790/791:

  M  A  K  R  E  L  D  L  R
CATGGCTAAGAGAGAATTGGACTTGAGA
    CGATTCTCTCTTAACCTGAACTCT

The pUC19 derived plasmid pKFN-852 was constructed in a similar
way as pKFN-849.
By following the procedure of Example 1 a plasmid pKFN-858
was obtained containing the following construction:

TPIp-LaC212spx3 signal-leader-GluLeuAspLeu-aprotinin(1-58)-TPIt.

The construction of plasmids pKFN-852 and pKFN-858 is illustrated
in Fig. 6. The DNA sequence of the 412 bp EcoRI-XbaI
fragment from pKFN-852 and pKFN-858 is given in Fig. 7.

Plasmid pKFN-858 was transformed into yeast strain MT663 as
described above resulting in yeast strain KFN-840.

A 200 ml culture of KFN-840 in YPD medium was shaken at 250
rpm at 30 °C for 3 days to an O.D. at 600 nm of 18 (dry biomass
16.7 g/liter). FPLC ion exchange chromatography of the supernatant
as described above gave a yield of 90 mg/liter of
Glu-Leu-Asp-Leu-aprotinin(1-58).

The amino acid analysis appears from Table 1 and confirms the
expected amino acid composition.

The trypsin and plasmin inhibition titration curves of
Glu-Leu-Asp-Leu-aprotinin(1-58) were indistinguishable from those
of bovine pancreatic aprotinin (Aprotinin Novo), see Fig. 5A
and 5B.

Example 3

Production of aprotinin(1-58) from yeast strain KFN-1006

Plasmid pKFN-995 was constructed from pKFN-802 by ligation of
the 3.0 kb NcoI-StuI fragment to the synthetic fragment
NOR-848/849:

  M  A  K  E  L  E  K  R  R
CATGGCTAAGGAATTGGAGAAGAGAAGG
    CGATTCCTTAACCTCTTCTCTTCC

The ligation mixture was digested with the restriction enzyme
BalI in order to reduce any background of pKFN-802.
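
One plausible reading of why BalI is used here, where StuI was used in Example 1, is sketched below: it checks which recognition sites are present at the linker junction. The recognition sequences (StuI AGGCCT, BalI TGGCCA) are standard; the downstream bases are taken from NOR-760, and the interpretation itself is an assumption rather than something the text states.

```python
STUI, BALI = "AGGCCT", "TGGCCA"  # palindromic recognition sequences
DOWNSTREAM = "CCTGATTTC"         # bases following the StuI cut in NOR-760

JUNCTIONS = {
    "pKFN-802 (parent)": "CATGGCCAAAAGAAGG" + DOWNSTREAM,
    "NOR-790 linker": "CATGGCTAAGAGAGAATTGAGA" + DOWNSTREAM,
    "NOR-848 linker": "CATGGCTAAGGAATTGGAGAAGAGAAGG" + DOWNSTREAM,
}

def sites(seq: str) -> dict:
    """Report which of the two recognition sequences occur in `seq`."""
    return {"StuI": STUI in seq, "BalI": BALI in seq}

for name, seq in JUNCTIONS.items():
    print(name, sites(seq))
```

In this sketch the NOR-848 linker regenerates a StuI site at the junction, so StuI would also cut the desired product, whereas the BalI site overlapping the NcoI end of pKFN-802 is lost in both linkers and therefore marks only the parent plasmid.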

The yeast expression plasmid pKFN-998 was constructed substantially
as described in Example 1 and as shown in Fig. 8:

TPIp-LaC212spx3 signal-leader(1-47)-Lys-Glu-Leu-Glu(Lys-Arg)-
aprotinin(1-58)-TPIt

The DNA sequence of the 412 bp EcoRI-XbaI fragment from
pKFN-995 and pKFN-998 is given in Fig. 9.

Plasmid pKFN-998 was transformed into yeast strain MT663 as
described in Example 1 resulting in yeast strain KFN-1006.
Culturing of the transformed strain KFN-1006 in YPD medium
and analysis for aprotinin(1-58) in the supernatant was performed
as described above.

The yield was 30 mg/liter of aprotinin(1-58).

The amino acid analysis of the purified material confirmed
the expected amino acid composition.

Example 4 

Production of aprotinin(1-58) from yeast strain KFN-1008

The pUC-derived plasmid pKFN-1000 was constructed as described
in Example 1 by ligation of the 3.0 kb NcoI-StuI
fragment of pKFN-802 to the synthetic fragment NOR-850/851:

  M  A  E  R  L  E  K  R  R
CATGGCTGAGAGATTGGAGAAGAGAAGG
    CGACTCTCTAACCTCTTCTCTTCC

By following the procedure of Examples 1 and 3 a yeast expression
plasmid pKFN-1003 was obtained containing the following
construction:

TPIp-LaC212spx3 signal-leader(1-47)-GluArgLeuGlu(Lys-Arg)-
aprotinin(1-58)-TPIt.

The construction of plasmids pKFN-1000 and pKFN-1003 is illustrated
in Fig. 10. The DNA sequence of the 412 bp EcoRI-XbaI
fragment from pKFN-1000 and pKFN-1003 is given in Fig. 11.

Plasmid pKFN-1003 was transformed into yeast strain MT663 as
described above resulting in yeast strain KFN-1008.

Culturing of the transformed strain KFN-1008 in YPD medium
and analysis for aprotinin(1-58) in the supernatant was performed
as described above. The yield was 120 mg/liter of
aprotinin(1-58).

The amino acid analysis of purified material confirmed the
expected amino acid composition. In addition, confirmation of
the complete primary structure was obtained by gas phase sequencing
of the reduced, pyridylethylated polypeptide.

Furthermore the specific inhibitory activity of the recombinant
aprotinin(1-58) against trypsin was indistinguishable
from that of bovine pancreatic aprotinin.

Example 5 
Production of aprotinin(1-58) from yeast strain KFN-783

Plasmid pKFN-802 (cf. Example 1) was cut with EcoRI and XbaI
and the 400 bp fragment was ligated to the 9.5 kb NcoI-XbaI
fragment from pMT636 and the 1.4 kb NcoI-EcoRI fragment from
pMT636, resulting in plasmid pKFN-803, see Fig. 2. pKFN-803
contains the following construction:

TPIp-LaC212spx3 signal-leader-aprotinin(1-58)-TPIt

Plasmid pKFN-803 was transformed into yeast strain MT663 as
described above. Culturing of the transformed strain KFN-783
in YPD medium and FPLC ion exchange chromatography of the supernatant
was performed as described above. Aprotinin(1-58) was quantified
by comparison to chromatography of a standardized
solution of authentic aprotinin purified from bovine pancreas.
The yield of aprotinin(1-58) was below 1 mg/liter.

Table 1

Amino acid composition

Amino acid   Aprotinin     Aprotinin   KFN-837      KFN-840
             Theoretical   Found       Found        Found

Asp          5             5.00        4.95         5.93 (+1)
Thr          3             2.86        2.84         2.85
Ser          1             0.94        0.92         0.93
Glu          3             3.04        3.97 (+1)    3.99 (+1)
Pro          4             4.18        3.90         3.86
Gly          6             5.95        5.91         5.94
Ala          6             5.85        5.92         5.94
Cys          6             5.20        5.21         5.22
Val          1             0.99        1.00         1.01
Met          1             0.83        0.65         0.67
Ile          2             1.39        1.58         1.58
Leu          2             1.97        3.00 (+1)    4.00 (+2)
Tyr          4             3.84        3.74         3.71
Phe          4             3.98        3.94         3.92
Lys          4             3.92        3.99         3.99
Arg          6             6.39        6.35         6.34

Totals       58            56.33       57.87        59.88
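
The theoretical column of Table 1 and the expected increments for the two analogues can be reproduced with a short sketch. The one-letter aprotinin(1-58) sequence below is an assumption in the sense that it is not printed in this section; it is the sequence encoded by the synthetic gene of Example 1. Asn and Gln are counted with Asp and Glu, as is usual for acid hydrolysates.

```python
from collections import Counter

# Assumed aprotinin(1-58) sequence, as encoded by the synthetic gene of Example 1.
APROTININ = "RPDFCLEPPYTGPCKARIIRYFYNAKAGLCQTFVYGGCRAKRNNFKSAEDCMRTCGGA"
HYDROLYSIS = {"N": "D", "Q": "E"}  # amides are measured as the corresponding acids

def composition(seq: str) -> Counter:
    """Residue counts as they would appear after acid hydrolysis."""
    return Counter(HYDROLYSIS.get(res, res) for res in seq)

native = composition(APROTININ)
for extension in ("EL", "ELDL"):  # Glu-Leu and Glu-Leu-Asp-Leu
    analogue = composition(extension + APROTININ)
    gained = {aa: analogue[aa] - native[aa] for aa in analogue if analogue[aa] != native[aa]}
    print(extension, gained)
```

The gains computed this way correspond to the (+1) and (+2) annotations in the KFN-837 and KFN-840 columns.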

Example 6 

Production of aprotinin(3-58)

A synthetic gene for aprotinin(3-58) was constructed from a
number of oligonucleotides by ligation.

The following 10 oligonucleotides were synthesized as described
in Example 1:

I:    AAAGAGATTTCTGTTTGGAACCTCCATACACTGGTCC (37-mer)
II:   TTACATGGACCAGTGTATGGAGGTTCCAAACAGAAACT (38-mer)
III:  ATGTAAAGCTAGAATCATCAGATACTTCTACAACG (35-mer)
IV:   CTTGGCGTTGTAGAAGTATCTGATGATTCTAGCT (34-mer)
V:    CCAAGGCTGGTTTGTGTCAAACTTTCGTTTACGGTGGCT (39-mer)
VI:   CTCTGCAGCCACCGTAAACGAAAGTTTGACACAAACCAGC (40-mer)
VII:  GCAGAGCTAAGAGAAACAACTTCAAGT (27-mer)
VIII: AGCAGACTTGAAGTTGTTTCTCTTAG (26-mer)
IX:   CTGCTGAAGACTGCATGAGAACTTGTGGTGGTGCCTAAT (39-mer)
X:    CTAGATTAGGCACCACCACAAGTTCTCATGCAGTCTTC (38-mer)

Five duplexes A - E were formed from the above 10 oligonucleotides
as shown in Fig. 12.

20 pmole of each of the duplexes A - E was formed from the
corresponding pairs of 5'-phosphorylated oligonucleotides I-X
by heating for 5 min. at 90 °C followed by cooling to room
temperature over a period of 75 minutes. The five duplexes
were mixed and treated with T4 ligase. The synthetic gene was
isolated as a 176 bp band after electrophoresis of the ligation
mixture on a 2% agarose gel. The obtained synthetic gene
is shown in Fig. 12.

The synthetic 176 bp gene was ligated to a 330 bp EcoRI-HgaI
fragment from plasmid pKFN-9 coding for the S. cerevisiae
mating factor α1 signal-leader(1-85) sequence (Markussen, J.
et al., Protein Engineering 1 (1987), 215-223) and to the 2.7
kb EcoRI-XbaI fragment from pUC19 (Yanisch-Perron, C., Vieira,
J. and Messing, J., Gene 33 (1985), 103-119). The construction
of pKFN-9 containing a HgaI site immediately after the
MFα1 leader sequence is described in EP 214 826.

The ligation mixture was used to transform a competent E.
coli strain (r-, m+) selecting for ampicillin resistance.
Sequencing of a 32P-XbaI-EcoRI fragment (Maxam, A. and Gilbert,
W., Methods Enzymol. 65 (1980), 499-560) showed that plasmids
from the resulting colonies contained the correct DNA sequence
for aprotinin(3-58).

One plasmid, pKFN305, was selected for further use. The
construction of plasmid pKFN305 is illustrated in Fig. 13.

pKFN305 was cut with EcoRI and XbaI and the 0.5 kb fragment
was ligated to the 9.5 kb NcoI-XbaI fragment from pMT636 and
the 1.4 kb NcoI-EcoRI fragment from pMT636, resulting in
plasmid pKFN374, see Fig. 13. Plasmid pMT636 was constructed
from pMT608 after deletion of the LEU-2 gene and from pMT479,
see Fig. 14. pMT608 is described in EP 195 691. pMT479 is described
in EP 163 529. pMT479 contains the Schizosaccharomyces
pombe TPI gene (POT), the S. cerevisiae triosephosphate
isomerase promoter and terminator, TPIp and TPIt (Alber, T.
and Kawasaki, G., J. Mol. Appl. Gen. 1 (1982), 419-434). Plasmid
pKFN374 contains the following sequence:

TPIp-MFα1-signal-leader(1-85)-aprotinin(3-58)-TPIt.

where MFα1 is the S. cerevisiae mating factor alpha 1 coding
sequence (Kurjan, J. and Herskowitz, I., Cell 30 (1982),
933-943), signal-leader(1-85) means that the sequence contains
the first 85 amino acid residues of the MFα1 signal-leader
sequence and aprotinin(3-58) is the synthetic sequence encoding
an aprotinin derivative lacking the first two amino
acid residues.

S. cerevisiae strain MT663 (E2-7B XE11-36 a/α, ΔtpiΔtpi, pep
4-3/pep 4-3) was grown on YPGaL (1% Bacto yeast extract, 2%
Bacto peptone, 2% galactose, 1% lactate) to an O.D. at 600 nm
of 0.6.

100 ml of the resulting culture was harvested by centrifugation,
washed with 10 ml of water, recentrifuged and resuspended
in 10 ml of a solution containing 1.2 M sorbitol, 25
mM Na2EDTA pH = 8.0, and 6.7 mg/ml dithiothreitol. The suspension
was incubated at 30 °C for 15 minutes, centrifuged and
the cells resuspended in 10 ml of a solution containing 1.2 M
sorbitol, 10 mM Na2EDTA, 0.1 M sodium citrate, pH = 5.8, and
2 mg Novozym® 234. The suspension was incubated at 30 °C for
30 minutes, the cells collected by centrifugation, washed in
10 ml of 1.2 M sorbitol and 10 ml of CAS (1.2 M sorbitol, 10
mM CaCl2, 10 mM Tris HCl (Tris = Tris(hydroxymethyl)aminomethane)
pH = 7.5) and resuspended in 2 ml of CAS. For transformation
0.1 ml of CAS-resuspended cells were mixed with approximately
1 µg of plasmid pKFN374 and left at room temperature
for 15 minutes. 1 ml of (20% polyethylene glycol 4000, 10
mM CaCl2, 10 mM Tris HCl, pH = 7.5) was added and the mixture
left for a further 30 minutes at room temperature. The mixture
was centrifuged and the pellet resuspended in 0.1 ml of SOS
(1.2 M sorbitol, 33% v/v YPD, 6.7 mM CaCl2, 14 µg/ml leucine)
and incubated at 30 °C for 2 hours. The suspension was then
centrifuged and the pellet resuspended in 0.5 ml of 1.2 M
sorbitol. Then, 6 ml of top agar (the SC medium of Sherman et
al. (Methods in Yeast Genetics, Cold Spring Harbor Laboratory,
1981) containing 1.2 M sorbitol plus 2.5% agar) at 52 °C
was added and the suspension poured on top of plates containing
the same agar-solidified, sorbitol containing medium.
Transformant colonies were picked after 3 days at 30 °C,
reisolated and used to start liquid cultures. One such transformant,
KFN322, was selected for further characterization.

Yeast strain KFN322 was grown on YPD medium (1% yeast extract,
2% peptone (from Difco Laboratories), and 2% glucose).
A 10 ml culture of the strain was shaken at 30 °C to an O.D.
at 600 nm of 32. After centrifugation the supernatant was
analyzed by FPLC ion exchange chromatography. The yeast
supernatant was filtered through a 0.22 µm Millex GV filter
unit and 1 ml was applied on a MonoS cation exchange column
(0.5 x 5 cm) equilibrated with 20 mM Bicine, pH 8.7. After
wash with equilibration buffer the column was eluted with a
linear NaCl gradient (0-1 M) in equilibration buffer. Trypsin
inhibitor activity was quantified in the eluted fractions by
spectrophotometric assay and furthermore by integration of
absorption at 280 nm from

E(1%, 280 nm) (aprotinin) = 8.3

The yield was about 3 mg/liter of aprotinin(3-58).
Example 7

Production of the insulin precursor B(1-29)-AlaAlaLys-A(1-21)
from yeast strain LaC1667

The 589 bp SphI-NcoI and the 172 bp HpaI-XbaI fragments were
isolated from pLaC212spx3 (described in WO 89/02463). These
fragments were joined with the synthetic adaptor

           M  A  K  E  L  E  K  R  F  V
NOR-962: CATGGCTAAGGAATTGGAAAAGAGATTCGTT
NOR-964:     CGATTCCTTAACCTTTTCTCTAAGCAA

and the 10 kb XbaI-SphI fragment from pMT743 (described in WO
89/02463), resulting in the plasmid pLaC240 (see Fig. 15).
The DNA sequence of the modified leader-insulin precursor
gene from pLaC240 is shown in Fig. 16.
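
The adaptor drawn above can be sanity-checked with a minimal sketch. It assumes that the lower strand (NOR-964) is written 3' to 5' beneath the upper strand, and that the four unpaired bases at the left are the NcoI-compatible CATG overhang; the function name is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def lower_strand_as_drawn(upper: str, overhang: int = 4) -> str:
    """Complement (not reversed) of the upper strand minus its 5' overhang."""
    return upper[overhang:].translate(COMPLEMENT)

NOR_962 = "CATGGCTAAGGAATTGGAAAAGAGATTCGTT"
NOR_964_AS_DRAWN = "CGATTCCTTAACCTTTTCTCTAAGCAA"

print(lower_strand_as_drawn(NOR_962) == NOR_964_AS_DRAWN)
print(NOR_962[:4], NOR_962[-3:])  # sticky end and the last bases of the blunt end
```

The upper strand ends in GTT, the left half of an HpaI site (GTTAAC), which fits the blunt HpaI end of the 172 bp fragment it is joined to.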

Transformation of yeast strain MT663 with plasmid pLaC240
gave rise to an insulin precursor secreting strain, LaC1667
(MT663/pLaC240), the productivity of which is 165% relative
to the strain LaC1414 (MT663/pLaC212spx3) containing the gene
for the unmodified leader-insulin precursor described in WO
89/02463.

Example 8

Production of insulin analogue precursor B(1-29,1Glu+27Glu)-
AlaAlaLys-A(1-21) from yeast strain KFN-470

By ligation of 10 oligonucleotides a synthetic gene coding
for the insulin analogue precursor B(1-29,1Glu+27Glu)-
AlaAlaLys-A(1-21) was constructed. B(1-29,1Glu+27Glu) signifies
the polypeptide containing the first 29 amino acid residues
of the B-chain of human insulin in which Glu residues
have been substituted for the B1Phe and B27Thr residues.
A(1-21) is the A-chain of human insulin. A tripeptide, AlaAlaLys,
connects the B29Lys residue to the A1Gly residue.
The following 10 oligonucleotides were synthesized:

NOR-542: AAAGAGAAGTTAACCAACACTTGTGCGGTTCCCAC
NOR-543: AACCAAGTGGGAACCGCACAAGTGTTGGTTAACTTC
NOR-69:  TTGGTTGAAGCTTTGTACTTGGTTTGCGGTGAAAGAGGTTTCT
NOR-73:  GTAGAAGAAACCTCTTTCACCGCAAACCAAGTACAAAGCTTC
NOR-315: TCTACGAACCTAAGGCTGCTAAGGGTATTGCT
NOR-316: ATTGTTCGACAATACCCTTAGCAGCCTTACGTTC
NOR-70:  GAACAATGCTGTACCTCCATCTGCTCCTTGTACCAAT
NOR-71:  TTTTCCAATTGGTACAAGGAGCAGATGGAGGTACAGC
NOR-78:  TGGAAAACTACTGCAACTAGACGCAGCCCGCAGGCT
NOR-72:  CTAGAGCCTGCGGGCTGCGTCTAGTTGCAGTAG

Five duplexes were formed from the above 10 oligonucleotides and
the duplexes were ligated in a similar way as described in
Example 1.

The synthetic 178 bp gene was ligated to the 330 bp EcoRI-HgaI
fragment from pKFN-9 encoding the S. cerevisiae mating factor
alpha 1 signal-leader(1-85) sequence (Markussen, J. et al.,
Protein Engineering 1 (1987), 215-223) and to the 2.7 kb
EcoRI-XbaI fragment of plasmid pUC19 (Yanisch-Perron, C.,
Vieira, J. and Messing, J., Gene 33 (1985), 103-119).

The ligation mixture was used to transform a competent E.
coli strain (r-, m+) selecting for ampicillin resistance.
Sequencing of a 32P-XbaI-EcoRI fragment (Maxam, A. and Gilbert,
W., Methods Enzymol. 65 (1980), 499-560) showed that plasmids
from the resulting colonies contained the correct DNA sequence
for the insulin analogue precursor.

One plasmid, pKFN-456, was selected for further use.

By following the procedure of Example 1 a yeast expression
plasmid pKFN-458 was obtained containing the following expression
cassette:

TPIp-MFα1-signal-leader(1-85)-B(1-29,1Glu+27Glu)-AlaAlaLys-
A(1-21)-TPIt.

The DNA sequence of the 508 bp EcoRI-XbaI fragment from
pKFN-456 and pKFN-458 is given in Fig. 17.

Plasmid pKFN-458 was transformed into yeast strain MT-663 as
described above resulting in yeast strain KFN-470.

Culturing of the transformed strain KFN-470 in YPD medium was
performed as described above. The yield of insulin analogue
precursor in the supernatant was determined by HPLC as described
(L. Snel et al., Chromatographia 24 (1987), 329-332).

In Table 2 the expression levels of the insulin analogue precursors
B(1-29,1Glu+27Glu)-AlaAlaLys-A(1-21) and
B(1-29,27Glu)-AlaAlaLys-A(1-21) and the insulin precursor
B(1-29)-AlaAlaLys-A(1-21) are compared. All three precursors
were expressed in the same host strain, MT-663, transformed
with the S. pombe TPI gene containing plasmids with the expression
cassette

TPIp-MFα1-signal-leader(1-85)-precursor-TPIt.

Table 2

Expression levels of precursors of insulin and insulin analogues
in yeast transformants

Precursor                                 Expression level*

B(1-29)-AlaAlaLys-A(1-21)                 100%
B(1-29,27Glu)-AlaAlaLys-A(1-21)           148%
B(1-29,1Glu+27Glu)-AlaAlaLys-A(1-21)      479%

*The expression levels are indicated as a percentage of the
expression level of the insulin precursor B(1-29)-AlaAlaLys-
A(1-21), which is arbitrarily set to 100%.
CLAIMS

1. A polypeptide comprising a fusion of a signal peptide, a
leader peptide and a heterologous protein or polypeptide,
which polypeptide is modified in its amino acid sequence adjacent
to a yeast processing site positioned between the C-terminal
end of the leader peptide and the N-terminal end of
the heterologous protein so as to provide a presentation of
the processing site which makes it accessible to proteolytic
cleavage, the polypeptide having the following structure

signal peptide-leader peptide-X1-X2-X3-X4-heterologous protein

wherein X1 is a peptide bond or represents one or more amino
acids which may be the same or different,

X2 and X3 are the same or different and represent a basic
amino acid selected from the group consisting of Lys and Arg,
X2 and X3 together defining a yeast processing site, and

X4 is a peptide bond or represents one or more amino acids
which may be the same or different,

with the proviso that X1 and/or X4 represent one or more
amino acids and that at least one of the amino acids represented
by X1 and/or X4 is a negatively charged amino acid selected
from the group consisting of Glu and Asp.
2. A polypeptide according to claim 1, wherein X 1 represents 
Glu or Asp. 

3. A polypeptide according to claim 1, wherein X 1 represents 
25 a sequence of two amino acids with the structure BA f wherein 

A is Glu or Asp, and B is Glu, Asp, Val, Gly or Leu. 

4. A polypeptide according to claim 1, wherein X 1 represents 
a sequence of three amino acids with the structure CBA, 
wherein A and B are as defined above, and C is Glu, Asp, Pro, 

30 Gly, Val, Leu, Arg or Lys. 
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5. A polypeptide according to claim 1, wherein X 1 represents 
a sequence of four amino acids with the structure DCBA, 
wherein A, B and C are as defined above, and D has the same 
meanings as C. 

5 6. A polypeptide according to claim 1, wherein X 1 represents 
a sequence of five amino acids with the structure EDCBA, 
wherein A, B, C and D are as defined above, and E has the 
same meanings as C. 

7. A polypeptide according to claim 1, wherein X 1 represents 
10 a sequence of six amino acids with the structure PEDCBA, 

wherein A, B, C, D and E are as defined above, and F has the 
same meanings as C. 

8. A polypeptide according to claim 1, wherein X 4 is Glu or 
Asp. 

15 9. A polypeptide according to claim 1, wherein X 4 represents 
a sequence of two amino acids with the structure AB, wherein 
A is Glu or Asp, and B is Glu, Asp, Val, Gly or Leu. 

10. A polypeptide according to claim 1, wherein X 4 represents
a sequence of three amino acids with the structure ABC,
wherein A and B are as defined above, and C is Glu, Asp, Pro,
Gly, Val, Leu, Arg or Lys.

11. A polypeptide according to claim 1, wherein X 4 represents 
a sequence of four amino acids with the structure ABCD, 
wherein A, B and C are as defined above, and D has the same
meanings as C.

12. A polypeptide according to claim 1, wherein X 4 represents 
a sequence of five amino acids with the structure ABCDE, 
wherein A, B, C and D are as defined above, and E has the 
same meanings as C. 




13. A polypeptide according to claim 1, wherein X 4 represents 
a sequence of six amino acids with the structure ABCDEF, 
wherein A, B, C, D and E are as defined above, and F has the 
same meanings as C.
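
The positional definitions of A to F in claims 2 to 13 can be sketched as a lookup of allowed residues per position. This is an illustrative reading only, assuming that position A is the residue nearest the processing site (last residue of X 1, first residue of X 4); the helper names `matches_x1` and `matches_x4` are hypothetical.

```python
# Illustrative sketch only: helper names are hypothetical.
A_SET = {"Glu", "Asp"}
B_SET = {"Glu", "Asp", "Val", "Gly", "Leu"}
C_SET = {"Glu", "Asp", "Pro", "Gly", "Val", "Leu", "Arg", "Lys"}  # also D, E, F

# Allowed residues for positions A, B, C, D, E, F in that order.
POSITION_SETS = [A_SET, B_SET, C_SET, C_SET, C_SET, C_SET]

def matches_x4(residues):
    """Check X 4 written N- to C-terminal as A, AB, ABC ... ABCDEF."""
    if not 1 <= len(residues) <= 6:
        return False
    return all(res in allowed for res, allowed in zip(residues, POSITION_SETS))

def matches_x1(residues):
    """Check X 1 written N- to C-terminal as A, BA, CBA ... FEDCBA."""
    return matches_x4(list(reversed(residues)))
```

Under this reading, the X 1 sequences Lys-Glu-Leu-Glu and Glu-Arg-Leu-Glu and the X 4 sequences Glu-Leu and Glu-Leu-Asp-Leu recited in the later claims all pass the check.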

14. A polypeptide according to claim 1, wherein, when X 1
and/or X 4 represent >1 amino acid, the amino acid immediately
adjacent to X 2 is Glu or Asp, the amino acid immediately
adjacent to X 3 is Glu or Asp, or both are Glu or Asp.

15. A polypeptide according to claim 1, wherein, when X 1
and/or X 4 represent >1 amino acid, either X 1 or X 4 or both
comprise more than one Glu or Asp.

16. A polypeptide according to claim 1, wherein X 1 and X 4
both represent >1 amino acid.

17. A polypeptide according to claim 16, wherein X 1 and X 4 
are symmetrically identical.

18. A polypeptide according to claim 1, wherein, when X 4
represents one or more amino acids, an additional processing
site is provided between X 4 and the N-terminal end of the
heterologous protein.

19. A polypeptide according to claim 1, wherein the signal 
peptide is the α-factor signal peptide, the signal peptide of
mouse salivary amylase, the carboxypeptidase signal peptide, 
or the yeast BAR1 signal peptide. 

20. A polypeptide according to claim 1, wherein the leader
peptide is a natural leader peptide such as the α-factor
leader peptide.

21. A polypeptide according to claim 1, wherein the leader 
peptide is a synthetic leader peptide such as 

A. Ala-Pro-Val-Thr-Gly-Asp-Glu-Ser-Ser-Val-Glu-Ile-
   Pro-Glu-Glu-Ser-Leu-Ile-Gly-Phe-Leu-Asp-Leu-Ala-
   Gly-Glu-Glu-Ile-Ala-Glu-Asn-Thr-Thr-Leu-Ala

B. Ala-Pro-Val-Thr-Gly-Asp-Glu-Ser-Ser-Val-Glu-Ile-
   Pro-Glu-Glu-Ser-Leu-Ile-Ile-Ala-Glu-Asn-Thr-Thr-
   Leu-Ala

C. Ala-Pro-Val-Thr-Gly-Asp-Glu-Ser-Ser-Val-Glu-Ile-
   Pro-Ile-Ala-Glu-Asn-Thr-Thr-Leu-Ala

D. Ala-Pro-Val-Thr-Gly-Asp-Glu-Ser-Ser-Val-Glu-Ile-
   Pro-Glu-Glu-Ser-Leu-Ile-Ile-Ala-Glu-Asn-Thr-Thr-
   Leu-Ala-Asn-Val-Ala-Met-Ala and

E. Gln-Pro-Val-Thr-Gly-Asp-Glu-Ser-Ser-Val-Glu-Ile-
   Pro-Glu-Glu-Ser-Leu-Ile-Ile-Ala-Glu-Asn-Thr-Thr-
   Leu-Ala-Asn-Val-Ala-Met-Ala

22. A polypeptide according to claim 1, wherein the heterologous
protein or polypeptide is selected from the group consisting
of aprotinin or other protease inhibitors, insulin
and insulin precursors, human or bovine growth hormone,
interleukin, tissue plasminogen activator, glucagon, Factor
VII, Factor VIII, Factor XIII, platelet-derived growth factor,
enzymes, and a functional analogue of any of these proteins.

23. A polypeptide according to claim 22, wherein the heterologous
protein or polypeptide is aprotinin or a functional
analogue thereof.

24. A polypeptide according to claim 23, wherein X 1 is Glu- 
Arg-Leu-Glu or Lys-Glu-Leu-Glu. 

25. A polypeptide according to claim 23, wherein X 4 is Glu- 
Leu or Glu-Leu-Asp-Leu.






26. A polypeptide according to claim 22, wherein the hetero- 
logous protein or polypeptide is insulin or an insulin pre- 
cursor or a functional analogue thereof. 

27. A polypeptide according to claim 26, wherein X 1 is Glu-
Arg-Leu-Glu or Lys-Glu-Leu-Glu.

28. A polypeptide according to claim 26, wherein X 4 is Glu.

29. A polypeptide according to any of claims 26-28, wherein
the heterologous protein or polypeptide is the insulin analogue
precursor B(1-29)-AlaAlaLys-A(1-21), and wherein X 1 is
Lys-Glu-Leu-Glu.

30. A polypeptide according to any of claims 26-28, wherein
the heterologous protein or polypeptide is the insulin analogue
precursor B(1-29)-SerAspAspAlaLys-A(1-21), and wherein
X 1 is Lys-Glu-Leu-Glu.

31. A DNA construct which comprises a DNA sequence encoding a
polypeptide according to any of claims 1-30. 

32. A DNA construct according to claim 31, which comprises 
the DNA sequence shown in Fig. 4, 7, 9 or 11, or a suitable 
modification thereof. 

33. A recombinant expression vector which is capable of
replicating in yeast and which carries a DNA construct according
to claim 31 or 32.

34. A yeast strain which is capable of expressing a heterologous
protein or polypeptide and which is transformed with a
vector according to claim 33.

35. A process for producing a heterologous protein or polypeptide
in yeast, comprising cultivating a yeast strain
according to claim 34 in a suitable medium to obtain expression
and secretion/processing of the heterologous protein or
polypeptide, after which the protein or polypeptide is isolated
from the medium.

36. An aprotinin analogue of the general formula
X 4-aprotinin(1-58), wherein X 4 represents an N-terminal extension
by one or more amino acids at least one of which is a negatively
charged amino acid selected from the group consisting
of Glu and Asp.

37. An analogue according to claim 36, wherein X 4 is Glu or
Asp. 



38. An analogue according to claim 36, wherein X 4 represents 
a sequence of two amino acids with the structure AB, wherein 
A is Glu or Asp, and B is Glu, Asp, Val, Gly or Leu. 

39. An analogue according to claim 36, wherein X 4 represents
a sequence of three amino acids with the structure ABC, 
wherein A and B are as defined above, and C is Glu, Asp, Pro, 
Gly, Val, Leu, Arg or Lys. 

40. An analogue according to claim 36, wherein X 4 represents
a sequence of four amino acids with the structure ABCD,
wherein A, B and C are as defined above, and D has the same
meanings as C.

41. An analogue according to claim 36, wherein X 4 represents
a sequence of five amino acids with the structure ABCDE,
wherein A, B, C and D are as defined above, and E has the
same meanings as C.



42. An analogue according to claim 36, wherein X 4 represents
a sequence of six amino acids with the structure ABCDEF,
wherein A, B, C, D and E are as defined above, and F has the
same meanings as C.

43. A polypeptide according to claim 36, wherein, when X 4
represents >1 amino acid, X 4 comprises more than one Glu or
Asp.

44. An analogue according to claim 36, wherein X 4 is Glu-Leu
or Glu-Leu-Asp-Leu.


             1           5              10
 MetAlaLysArgArgProAspPheCysLeuGluProProTyrThrGly
NcoI         StuI
CATGGCCAAAAGAAGGCCTGATTTCTGTTTGGAACCTCCATACACTGGT
    CGGTTTTCTTCCGGACTAAAGACAAACCTTGGAGGTATGTGACCA

      15             20             25
ProCysLysAlaArgIleIleArgTyrPheTyrAsnAlaLysAlaGly
CCATGTAAAGCTAGAATCATCAGATACTTCTACAACGCCAAGGCTGGT
GGTACATTTCGATCTTAGTAGTCTATGAAGATGTTGCGGTTCCGACCA

   30             35             40
LeuCysGlnThrPheValTyrGlyGlyCysArgAlaLysArgAsnAsn
TTGTGTCAAACTTTCGTTTACGGTGGCTGCAGAGCTAAGAGAAACAAC
AACACAGTTTGAAAGCAAATGCCACCGACGTCTCGATTCTCTTTGTTG

45             50             55       58
PheLysSerAlaGluAspCysMetArgThrCysGlyGlyAlaStop
TTCAAGTCTGCTGAAGACTGCATGAGAACTTGTGGTGGTGCCTAAT
AAGTTCAGACGACTTCTGACGTACTCTTGAACACCACCACGGATTAGATC


Fig. 1 
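
The pairing of the two strands in the first row of Fig. 1 can be checked mechanically. The sketch below is illustrative only and assumes that the lower strand is printed 3'→5' directly beneath the upper strand, with the upper strand carrying a four-base 5' overhang (CATG); the function names are hypothetical.

```python
# Illustrative sketch only: assumes the lower strand is written 3'->5'.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(strand: str) -> str:
    """Return the base-by-base complement (no reversal)."""
    return strand.translate(COMPLEMENT)

def pairs_with(top: str, bottom: str, offset: int) -> bool:
    """True if `bottom` base-pairs with `top` starting at index `offset`."""
    return complement(top[offset:offset + len(bottom)]) == bottom

# First row of Fig. 1 (upper and lower strand).
top = "CATGGCCAAAAGAAGGCCTGATTTCTGTTTGGAACCTCCATACACTGGT"
bottom = "CGGTTTTCTTCCGGACTAAAGACAAACCTTGGAGGTATGTGACCA"

print(pairs_with(top, bottom, 4))
```

With an offset of four bases the two strands pair over the full length of the lower strand, which is consistent with a single-stranded CATG overhang at the NcoI end.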







Fig. 2 

Fig. 3

        10        20        30        40        50        60
GAATTCCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAATTTCATACACAAT

        70        80        90       100       110       120
ATAAACGACCAAAAGAATGAAGGCTGTTTTCTTGGTTTTGTCCTTGATCGGATTCTGCTG
                METLysAlaValPheLeuValLeuSerLeuIleGlyPheCysTrp

       130       140       150       160       170       180
GGCCCAACCAGTCACTGGCGATGAATCATCTGTTGAGATTCCGGAAGAGTCTCTGATCAT
 AlaGlnProValThrGlyAspGluSerSerValGluIleProGluGluSerLeuIleIle

       190       200       210       220       230       240
CGCTGAAAACACCACTTTGGCTAACGTCGCCATGGCTAAGAGAGAATTGAGACCTGATTT
 AlaGluAsnThrThrLeuAlaAsnValAlaMETAlaLysArgGluLeuArgProAspPhe

       250       260       270       280       290       300
CTGTTTGGAACCTCCATACACTGGTCCATGTAAAGCTAGAATCATCAGATACTTCTACAA
 CysLeuGluProProTyrThrGlyProCysLysAlaArgIleIleArgTyrPheTyrAsn

       310       320       330       340       350       360
CGCCAAGGCTGGTTTGTGTCAAACTTTCGTTTACGGTGGCTGCAGAGCTAAGAGAAACAA
 AlaLysAlaGlyLeuCysGlnThrPheValTyrGlyGlyCysArgAlaLysArgAsnAsn

       370       380       390       400       410
CTTCAAGTCTGCTGAAGACTGCATGAGAACTTGTGGTGGTGCCTAATCTAGA
 PheLysSerAlaGluAspCysMETArgThrCysGlyGlyAla



Fig. 4 
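
The correspondence between the nucleotide and amino acid lines of Fig. 4 can be reproduced with a standard genetic code lookup. The following is a minimal sketch, not a description of how the figure was prepared; it uses one-letter amino acid codes, and the function name `translate` and the short excerpt chosen are illustrative assumptions.

```python
# Illustrative sketch only: standard genetic code, one-letter amino acid output.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate in frame from the first base, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# In-frame excerpt of Fig. 4 spanning the Lys-Arg processing site.
excerpt = "ATGGCTAAGAGAGAATTGAGACCTGAT"
protein = translate(excerpt)
site = protein.find("KR")          # position of the dibasic Lys-Arg site
downstream = protein[site + 2:]    # residues C-terminal to the site

print(protein, downstream)
```

For this excerpt the translation is Met-Ala-Lys-Arg-Glu-Leu-Arg-Pro-Asp, so the residues following the Lys-Arg site begin with Glu-Leu, in agreement with the amino acid line of the figure.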

Fig. 6

        10        20        30        40        50        60
GAATTCCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAATTTCATACACAAT

        70        80        90       100       110       120
ATAAACGACCAAAAGAATGAAGGCTGTTTTCTTGGTTTTGTCCTTGATCGGATTCTGCTG
                METLysAlaValPheLeuValLeuSerLeuIleGlyPheCysTrp

       130       140       150       160       170       180
GGCCCAACCAGTCACTGGCGATGAATCATCTGTTGAGATTCCGGAAGAGTCTCTGATCAT
 AlaGlnProValThrGlyAspGluSerSerValGluIleProGluGluSerLeuIleIle

       190       200       210       220       230       240
CGCTGAAAACACCACTTTGGCTAACGTCGCCATGGCTAAGAGAGAATTGGACTTGAGACC
 AlaGluAsnThrThrLeuAlaAsnValAlaMETAlaLysArgGluLeuAspLeuArgPro
                                          ↑

       250       260       270       280       290       300
TGATTTCTGTTTGGAACCTCCATACACTGGTCCATGTAAAGCTAGAATCATCAGATACTT
 AspPheCysLeuGluProProTyrThrGlyProCysLysAlaArgIleIleArgTyrPhe

       310       320       330       340       350       360
CTACAACGCCAAGGCTGGTTTGTGTCAAACTTTCGTTTACGGTGGCTGCAGAGCTAAGAG
 TyrAsnAlaLysAlaGlyLeuCysGlnThrPheValTyrGlyGlyCysArgAlaLysArg

       370       380       390       400       410
AAACAACTTCAAGTCTGCTGAAGACTGCATGAGAACTTGTGGTGGTGCCTAATCTAGA
 AsnAsnPheLysSerAlaGluAspCysMETArgThrCysGlyGlyAla



Fig. 7 

Fig. 8

        10        20        30        40        50        60
GAATTCCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAATTTCATACACAAT

        70        80        90       100       110       120
ATAAACGACCAAAAGAATGAAGGCTGTTTTCTTGGTTTTGTCCTTGATCGGATTCTGCTG
                METLysAlaValPheLeuValLeuSerLeuIleGlyPheCysTrp

       130       140       150       160       170       180
GGCCCAACCAGTCACTGGCGATGAATCATCTGTTGAGATTCCGGAAGAGTCTCTGATCAT
 AlaGlnProValThrGlyAspGluSerSerValGluIleProGluGluSerLeuIleIle

       190       200       210       220       230       240
CGCTGAAAACACCACTTTGGCTAACGTCGCCATGGCTAAGGAATTGGAGAAGAGAAGGCC
 AlaGluAsnThrThrLeuAlaAsnValAlaMETAlaLysGluLeuGluLysArgArgPro
                                                      ↑

       250       260       270       280       290       300
TGATTTCTGTTTGGAACCTCCATACACTGGTCCATGTAAAGCTAGAATCATCAGATACTT
 AspPheCysLeuGluProProTyrThrGlyProCysLysAlaArgIleIleArgTyrPhe

       310       320       330       340       350       360
CTACAACGCCAAGGCTGGTTTGTGTCAAACTTTCGTTTACGGTGGCTGCAGAGCTAAGAG
 TyrAsnAlaLysAlaGlyLeuCysGlnThrPheValTyrGlyGlyCysArgAlaLysArg

       370       380       390       400       410
AAACAACTTCAAGTCTGCTGAAGACTGCATGAGAACTTGTGGTGGTGCCTAATCTAGA
 AsnAsnPheLysSerAlaGluAspCysMETArgThrCysGlyGlyAla

        10        20        30        40        50        60
GAATTCCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAATTTCATACACAAT

        70        80        90       100       110       120
ATAAACGACCAAAAGAATGAAGGCTGTTTTCTTGGTTTTGTCCTTGATCGGATTCTGCTG
                METLysAlaValPheLeuValLeuSerLeuIleGlyPheCysTrp

       130       140       150       160       170       180
GGCCCAACCAGTCACTGGCGATGAATCATCTGTTGAGATTCCGGAAGAGTCTCTGATCAT
 AlaGlnProValThrGlyAspGluSerSerValGluIleProGluGluSerLeuIleIle

       190       200       210       220       230       240
CGCTGAAAACACCACTTTGGCTAACGTCGCCATGGCTGAGAGATTGGAGAAGAGAAGGCC
 AlaGluAsnThrThrLeuAlaAsnValAlaMETAlaGluArgLeuGluLysArgArgPro

       250       260       270       280       290       300
TGATTTCTGTTTGGAACCTCCATACACTGGTCCATGTAAAGCTAGAATCATCAGATACTT
 AspPheCysLeuGluProProTyrThrGlyProCysLysAlaArgIleIleArgTyrPhe

       310       320       330       340       350       360
CTACAACGCCAAGGCTGGTTTGTGTCAAACTTTCGTTTACGGTGGCTGCAGAGCTAAGAG
 TyrAsnAlaLysAlaGlyLeuCysGlnThrPheValTyrGlyGlyCysArgAlaLysArg

       370       380       390       400       410
AAACAACTTCAAGTCTGCTGAAGACTGCATGAGAACTTGTGGTGGTGCCTAATCTAGA
 AsnAsnPheLysSerAlaGluAspCysMETArgThrCysGlyGlyAla

     AspPheCysLeuGluProProTyrThrGlyProCysLysAlaArgIle
AAAGAGATTTCTGTTTGGAACCTCCATACACTGGTCCATGTAAAGCTAGAATC
     CTAAAGACAAACCTTGGAGGTATGTGACCAGGTACATTTCGATCTTAG

   20             25             30             35
IleArgTyrPheTyrAsnAlaLysAlaGlyLeuCysGlnThrPheValTyrGly
ATCAGATACTTCTACAACGCCAAGGCTGGTTTGTGTCAAACTTTCGTTTACGGT
TAGTCTATGAAGATGTTGCGGTTCCGACCAAACACAGTTTGAAAGCAAATGCCA

         40             45             50
GlyCysArgAlaLysArgAsnAsnPheLysSerAlaGluAspCysMetArgThr
GGCTGCAGAGCTAAGAGAAACAACTTCAAGTCTGCTGAAGACTGCATGAGAACT
CCGACGTCTCGATTCTCTTTGTTGAAGTTCAGACGACTTCTGACGTACTCTTGA

55       58
CysGlyGlyAlaStop XbaI
TGTGGTGGTGCCTAAT
ACACCACCACGGATTAGATC

Fig. 15

        10        20        30        40        50        60
GAATTCCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAATTTCATACACAAT

        70        80        90       100       110       120
ATAAACGACCAAAAGAATGAAGGCTGTTTTCTTGGTTTTGTCCTTGATCGGATTCTGCTG
                METLysAlaValPheLeuValLeuSerLeuIleGlyPheCysTrp

       130       140       150       160       170       180
GGCCCAACCAGTCACTGGCGATGAATCATCTGTTGAGATTCCGGAAGAGTCTCTGATCAT
 AlaGlnProValThrGlyAspGluSerSerValGluIleProGluGluSerLeuIleIle

       190       200       210       220       230       240
CGCTGAAAACACCACTTTGGCTAACGTCGCCATGGCTAAGGAATTGGAAAAGAGATTCGT
 AlaGluAsnThrThrLeuAlaAsnValAlaMETAlaLysGluLeuGluLysArgPheVal
                                                      ↑

       250       260       270       280       290       300
TAACCAACACTTGTGCGGTTCCCACTTGGTTGAAGCTTTGTACTTGGTTTGCGGTGAAAG
 AsnGlnHisLeuCysGlySerHisLeuValGluAlaLeuTyrLeuValCysGlyGluArg

       310       320       330       340       350       360
AGGTTTCTTCTACACTCCTAAGGCTGCTAAGGGTATTGTCGAACAATGCTGTACCTCCAT
 GlyPhePheTyrThrProLysAlaAlaLysGlyIleValGluGlnCysCysThrSerIle

       370       380       390       400       410
CTGCTCCTTGTACCAATTGGAAAACTACTGCAACTAGACGCAGCCCGCAGGCTCTAGA
 CysSerLeuTyrGlnLeuGluAsnTyrCysAsn

        10        20        30        40        50        60
GAATTCCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAATTTCATACACAAT

        70        80        90       100       110       120
ATAAACGATTAAAAGAATGAGATTTCCTTCAATTTTTACTGCAGTTTTATTCGCAGCATC
                METArgPheProSerIlePheThrAlaValLeuPheAlaAlaSer

       130       140       150       160       170       180
CTCCGCATTAGCTGCTCCAGTCAACACTACAACAGAAGATGAAACCGCACAAATTCCGGC
 SerAlaLeuAlaAlaProValAsnThrThrThrGluAspGluThrAlaGlnIleProAla

       190       200       210       220       230       240
TGAAGCTGTCATCGGTTACTTAGATTTAGAAGGGGATTTCGATGTTGCTGTTTTGCCATT
 GluAlaValIleGlyTyrLeuAspLeuGluGlyAspPheAspValAlaValLeuProPhe

       250       260       270       280       290       300
TTCCAACAGCACAAATAACGGGTTATTGTTTATAAATACTACTATTGCCAGCATTGCTGC
 SerAsnSerThrAsnAsnGlyLeuLeuPheIleAsnThrThrIleAlaSerIleAlaAla

       310       320       330       340       350       360
TAAAGAAGAAGGGGTATCTTTGGATAAAAGAGAAGTTAACCAACACTTGTGCGGTTCCCA
 LysGluGluGlyValSerLeuAspLysArgGluValAsnGlnHisLeuCysGlySerHis
                              ↑

       370       380       390       400       410       420
CTTGGTTGAAGCTTTGTACTTGGTTTGCGGTGAAAGAGGTTTCTTCTACGAACCTAAGGC
 LeuValGluAlaLeuTyrLeuValCysGlyGluArgGlyPhePheTyrGluProLysAla

       430       440       450       460       470       480
TGCTAAGGGTATTGTCGAACAATGCTGTACCTCCATCTGCTCCTTGTACCAATTGGAAAA
 AlaLysGlyIleValGluGlnCysCysThrSerIleCysSerLeuTyrGlnLeuGluAsn

       490       500       510
CTACTGCAACTAGACCCAGCCCGCAGGCTCTAGA
 TyrCysAsn

Fig. 17